{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Running many simulations in parallel using OpenMP"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Note: you may need to restart the kernel to use updated packages.\n"
     ]
    }
   ],
   "source": [
    "%pip install \"pybamm[plot,cite]\" -q    # install PyBaMM if it is not installed\n",
    "import time\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "import pybamm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Workflows such as parameter sweeps require running many identical simulations with different input parameters. PyBaMM provides a way to run many simulations in parallel using OpenMP using the IDAKLU solver. Looking at the [API docs](https://docs.pybamm.org/en/stable/source/api/solvers/idaklu_solver.html#pybamm.IDAKLUSolver) for this solver, we see a `num_threads` argument that can be used to set the number of threads to use when creating the solver, like so:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "solver = pybamm.IDAKLUSolver(options={\"num_threads\": 2})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This option will have no effect if you try to solve a single simulation, but if use an input parameter in your model and pass in a list of values for that parameter, PyBaMM will automatically distribute the simulations across the number of threads you specify. For example,"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[<pybamm.solvers.solution.Solution at 0x7d2a1418dfc0>,\n",
       " <pybamm.solvers.solution.Solution at 0x7d2a1418e170>]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model = pybamm.lithium_ion.DFN()\n",
    "params = model.default_parameter_values\n",
    "params[\"Current function [A]\"] = \"[input]\"\n",
    "sim = pybamm.Simulation(model, parameter_values=params, solver=solver)\n",
    "sim.solve(\n",
    "    [0, 3600], inputs=[{\"Current function [A]\": 1}, {\"Current function [A]\": 0.5}]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The previous code will then run the 2 simulations in parallel on 2 threads. Since we are solving such a small number of simulations, the overhead of parallelization will not be worth it and it is likely that this will be slower than solving in serial. Below we show an example of solving 1000 simulations in parallel, varying the number of threads used to see the effect on the runtime."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Time taken to solve 1000 DFN simulation for 32 threads: 2.65 s\n",
      "Time taken to solve 1000 DFN simulation for 16 threads: 3.72 s\n",
      "Time taken to solve 1000 DFN simulation for 8 threads: 6.07 s\n",
      "Time taken to solve 1000 DFN simulation for 4 threads: 10.11 s\n",
      "Time taken to solve 1000 DFN simulation for 2 threads: 17.73 s\n",
      "Time taken to solve 1000 DFN simulation for 1 threads: 26.46 s\n"
     ]
    }
   ],
   "source": [
    "n = 1e3\n",
    "current_inputs = [\n",
    "    {\"Current function [A]\": current} for current in np.linspace(0, 0.6, int(n))\n",
    "]\n",
    "num_threads_list = [1, 2, 4, 8, 16, 32]\n",
    "for num_threads in reversed(num_threads_list):\n",
    "    model = pybamm.lithium_ion.DFN()\n",
    "    params = model.default_parameter_values\n",
    "    params.update(\n",
    "        {\n",
    "            \"Current function [A]\": \"[input]\",\n",
    "        }\n",
    "    )\n",
    "    solver = pybamm.IDAKLUSolver(options={\"num_threads\": num_threads})\n",
    "    sim = pybamm.Simulation(model, solver=solver, parameter_values=params)\n",
    "    start_time = time.perf_counter()\n",
    "    sol = sim.solve([0, 3600], inputs=current_inputs)\n",
    "    end_time = time.perf_counter()\n",
    "    print(\n",
    "        f\"Time taken to solve 1000 DFN simulation for {num_threads} threads: {end_time - start_time:.2f} s\"\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can see that the speed-up from using more threads starts to diminish after a certain point (about 20-30 threads), and this effect will be more pronounced for smaller numbers of simulations, or if the simulations are very quick to solve (e.g. an SPM model). Below we show the same example, but this time using the SPM model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Time taken to solve 1000 SPM simulation for 32 threads: 0.42 s\n",
      "Time taken to solve 1000 SPM simulation for 16 threads: 0.70 s\n",
      "Time taken to solve 1000 SPM simulation for 8 threads: 0.43 s\n",
      "Time taken to solve 1000 SPM simulation for 4 threads: 0.53 s\n",
      "Time taken to solve 1000 SPM simulation for 2 threads: 0.90 s\n",
      "Time taken to solve 1000 SPM simulation for 1 threads: 0.86 s\n"
     ]
    }
   ],
   "source": [
    "n = 1e3\n",
    "current_inputs = [\n",
    "    {\"Current function [A]\": current} for current in np.linspace(0, 0.6, int(n))\n",
    "]\n",
    "num_threads_list = [1, 2, 4, 8, 16, 32]\n",
    "for num_threads in reversed(num_threads_list):\n",
    "    model = pybamm.lithium_ion.SPM()\n",
    "    params = model.default_parameter_values\n",
    "    params.update(\n",
    "        {\n",
    "            \"Current function [A]\": \"[input]\",\n",
    "        }\n",
    "    )\n",
    "    solver = pybamm.IDAKLUSolver(options={\"num_threads\": num_threads})\n",
    "    sim = pybamm.Simulation(model, solver=solver, parameter_values=params)\n",
    "    start_time = time.perf_counter()\n",
    "    sol = sim.solve([0, 3600], inputs=current_inputs)\n",
    "    end_time = time.perf_counter()\n",
    "    print(\n",
    "        f\"Time taken to solve 1000 SPM simulation for {num_threads} threads: {end_time - start_time:.2f} s\"\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So in this case the speed-up for using multiple threads to solve 1000 SPM simulations is much less than for the DFN simulations, and above 4 threads no speed-up is observed at all."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "env",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
